Pre-Motor Response Time Benefits in Multi-Modal Displays


James L. Merlo¹, P. A. Hancock²

¹United States Military Academy
West Point, NY, USA

²University of Central Florida
Orlando, FL, USA


ABSTRACT 

The  present  series  of  experiments  tested  the  assimilation  and  efficacy  of  purpose- 
created  tactile  messages  based  on  five  common  military  arm  and  hand  signals.  We 
compared  the  response  times  and  accuracy  rates  to  these  tactile  representations 
against  the  comparable  responses  to  equivalent  visual  representations  of  these  same 
messages.  Results  indicated  that  there  was  a  performance  benefit  for  concurrent 
message  presentations  which  showed  superior  response  times  and  improved 
accuracy  rates  when  compared  to  individual  presentations  in  either  modality.  Such 
improvement  was  identified  as  being  due  largely  to  a  reduction  in  pre-motor 
response time and these improvements occurred equally in a military and
non-military population. Results were not contingent upon the gender of the participant.
Potential  reasons  for  this  multi-modal  facilitation  are  discussed.  The  novel 
techniques  employed  to  measure  pre-motor  response  inform  computational  neuro- 
ergonomic  models  for  multi-modal  advantages  in  dynamic  signaling.  On  a  practical 
level, these results confirm the utility of tactile messaging to augment visual
messaging,  especially  in  challenging  and  stressful  environments  where  visual 
messaging  may  not  always  be  feasible  or  effective. 

Keywords: Visual Signaling, Tactile Signaling, Multi-Modal Advantage.


INTRODUCTION 


Humans  rely  on  their  multiple  sensory  systems  to  continually  integrate  the 
environmental  stimuli  around  them  in  order  to  build  their  perception  of  the  world  in 
which they live. While each sense is, in itself, remarkably adept at detection, it is the
combination  and  integration  of  these  disparate  sensory  inputs  which  provide  the 
rich  tapestry  of  spatial,  temporal,  and  object  information  on  which  humans  rely  to 
survive  and  thrive.  The  cross-modal  fusion  of  these  information  sources  is  often 
more  beneficial  than  simply  increasing  information  from  only  one  sensory 
modality.  For  example,  Hillis,  Ernst,  Banks  and  Landy  (2002)  found  that  when 
combined,  the  value  of  multiple  visual  cues  (e.g.,  disparity  and  texture  gradients) 
did  not  produce  as  accurate  performance  as  when  visual  and  tactile  cues  were 
provided  in  an  object  property  discrimination  task.  Comparing  performance  within 
the  same  modality  versus  combinations  of  two  or  more  different  modalities 
illustrates  that  information  loss  can  occur  during  intra-modal  presentations  that  does 
not  occur  with  the  fusion  across  different  modalities.  In  the  specific  case  of  tactile 
and  visual  information  there  seems  to  be  a  highly  efficient  integration  of  the  two 
sources  (Ernst  &  Banks,  2002).  This  integration  is  especially  beneficial  when  the 
cross-modal cues are congruent and match the top-down expectancies generated by
past  experience. 

Humans  not  only  rely  on  their  multiple  sensory  capacities  to  integrate  different 
forms  of  stimuli,  they  also  use  these  multiple  sources  to  aid  them  in  the  initial 
process  of  orientation  and  the  subsequent  focus  of  their  attention  in  space  and  time. 
When  an  individual  directs  their  attention,  regardless  of  the  primary  modality  used 
in  the  process  of  detection,  the  other  modalities  are  also  frequently  directed  toward 
that  same  location.  Indeed,  it  is  the  subject  of  an  on-going  debate  as  to  the  degree  to 
which such orientation of attention is a multi-sensory construction (Spence &
Driver,  2004)  versus  an  over-dominantly  visual  process  (Posner,  Nissen,  &  Klein, 
1976).  In  part,  this  issue  can  be  approached  from  a  neuro-physiological  perspective. 
For  example,  Stein  and  Meredith  (1993)  have  shown  that  bimodal  and  tri-modal 
neurons  have  a  stronger  cellular  response  when  animals  are  presented  with  stimuli 
from  two  sensory  modalities  as  compared  with  stimulation  from  only  one  modality. 
The  combinations  of  two  different  sensory  stimuli  have  been  shown  to  significantly 
enhance  the  responses  of  neurons  in  the  superior  colliculus  (SC)  above  those 
evoked  by  either  uni-modal  stimulus  alone.  Such  an  observation  supports  the 
conclusion that there is a multi-sensory link among individual SC neurons for
cross-modality attention and orientation behaviors (see also Wallace, Meredith, & Stein,
1998). Multi-modal stimulation in the world is not always presented or received in a
congruent spatial and temporal manner. This problem can be resolved in the brain
by an over-reliance on a single dominant system, which in humans is
expressed  in  the  visual  modality  (see  Hancock,  2005). 

To  date,  the  exploration  into  the  cross-modal  attentional  phenomenon  has  relied 
mainly  on  simple  stimuli  to  elicit  response  (Spence  &  Walton,  2005).  Gray  and  Tan 
(2002)  used  a  number  of  tactors  (vibro-tactile  actuators)  spanning  the  length  of  the 
participant’s  arm  with  lights  mounted  on  the  individual  tactors.  Using  an 
appropriate  inter-stimulus  interval  (ISI)  and  tactor  spacing  (see  Geldard,  1982)  to 
create  the  illusion  of  movement,  either  up  or  down  the  arm,  they  found  that 
response  times  were  faster  when  the  visual  target  was  offset  in  the  same  direction  as 
the  tactile  motion  (similar  to  the  predictive  abilities  one  has  to  know  the  location  of 
an insect when it runs up or down the arm). Reaction times were slower when the
target  was  offset  in  the  direction  opposite  to  the  tactile  motion.  Such  a  finding 
supports  the  idea  that  the  cross-modal  links  between  vision  and  touch  are  updated 
dynamically  for  moving  objects  and  are  best  supported  perceptually  when  the 
stimuli  are  congruent. 

In  another  study,  Craig  (2006)  had  participants  judge  the  direction  of  apparent 
motion  by  stimulating  two  locations  sequentially  on  a  participant’s  finger  pad  using 
vibro-tactors.  Visual  trials  included  apparent  motion  induced  by  the  activation  of 
two  lights  sequentially.  Some  trials  also  were  recorded  with  both  visual  and  tactile 
stimuli  presented  together  either  congruently  or  incongruently.  When  visual  motion 
was  presented  at  the  same  time  as,  but  in  a  direction  opposite  to  tactile  motion, 
accuracy  in  judging  the  direction  of  tactile  apparent  motion  was  substantially 
reduced.  This  superior  performance  during  congruent  presentation  was  referred  to 
as 'the congruency effect'. A similar experiment was conducted by Strybel and Vatakis
(2004), who used visual apparent motion and found similar effects for judgments of
auditory  apparent  motion.  Auditory  stimuli  have  also  been  shown  to  affect  the 
perceived  direction  of  tactile  apparent  motion  (see  Soto-Faraco,  Spence,  & 
Kingstone,  2004). 

While  all  of  these  experiments  with  simple  tasks  are  essential  for  understanding 
the  psychological  phenomena  being  studied,  the  extension  of  these  findings  into 
real-world  conditions  to  embrace  more  applied  stimuli  is  as  yet  largely  unexplored. 
However, with advancements in tactile display technology and innovative signaling
techniques, testing systems capable of assisting actual field
communications is now both feasible and pragmatically important. Thus, the
purpose  of  the  present  experiment  was  to  examine  combinations  of  visual  and 
tactile  communications  of  real-world  operational  signals  in  order  to  evaluate  their 
efficacy for real-world applications. We also sought to determine whether
multi-modal signal presentation led to performance advantages under such circumstances.


EXPERIMENTAL METHOD

EXPERIMENTAL  PARTICIPANTS 

To investigate the foregoing propositions, 72 participants (47 males and 25 females)
ranging in age from 18 to 21, with an average age of 18.5 years, volunteered to
participate. Of these individuals, 31 were from a large public southern metropolitan
university and the remaining 41 were from the United States Military Academy. The
latter group had prior experience with the visual form of the presented military
signals; the tactile form of the signals was new to all participants.

EXPERIMENTAL  MATERIALS  AND  APPARATUS 

The vibro-tactile actuators (tactors) used in the present system were the model C2,
manufactured by Engineering Acoustics, Inc. (EAI). They are acoustic transducers
that deliver 200-300 Hz sinusoidal vibrations to the skin. Their 17 g mass is
sufficient for activating the skin’s tactile receptors. The tactile display itself is a
belt-like device with eight vibro-tactile actuators. Examples of the present belt system
are shown in Figure 1. When the belt is stretched around the body and fastened, the wearer
has an actuator over the umbilicus and one centered over the spine in the back. The
other six actuators are equally spaced around the body, three on each side, for a
total of eight (see also Cholewiak, Brill, & Schwab, 2004).


Figure 1. Three tactile display belt assemblies are shown along with their
controller box.

The tactors are operated using a Tactor Control Unit (TCU), a computer-controlled
driver/amplifier system that switches each tactor on and off as required.
This device is shown on the left side of the tactile display belts in Figure 1. The
TCU weighs 1.2 lbs independent of its power source and is approximately one inch
thick. This device connects to a power source with one cable and to the display belt
with the other and uses Bluetooth technology to communicate with the
computer-driven interface. Tactile messages were created using five standard Army and
Marine Corps arm and hand signals (Department of the Army, 1987). The five
signals chosen for the present experiment were “Attention”, “Halt”, “Rally”,
“Move Out”, and “Nuclear Biological Chemical Event (NBC)”. The tactile
representations of these signals were designed in a collaborative effort involving a
consultant group of subject matter experts (SMEs) consisting of former US Soldiers
and Marines.

Short  video  clips  of  a  soldier  in  uniform  performing  these  five  arm  and  hand 
signals  were  edited  to  create  the  visual  stimuli.  Careful  editing  ensured  the  timing 
of  the  arm  and  hand  signals  closely  matched  that  of  the  tactile  presentations  (see 
Figure 2). A Samsung Q1 Ultra Mobile computer using an Intel Celeron M ULV
(900 MHz) processor with a 7” WVGA (800 x 480) liquid crystal display was used
to present videos of the soldier performing the arm and hand signals. This computer
ran a custom LabVIEW (8.2; National Instruments) application that presented the
tactile  signals  via  Bluetooth  to  the  tactor  controller  board  and  captured  all  of  the 
participant’s responses via mouse input. Participants wore sound-dampening
headphones with a reduction rating of 11.3 dB at 250 Hz. This precaution was
designed to mask any possible effects which could have accrued due to extraneous
auditory stimuli produced by tactor actuation. As this is an issue which has caused
some degree of controversy in the past, we were careful to control for this potential
artifact in our own work (cf. Broadbent, 1978; Poulton, 1977).


Figure 2. A computer screen shot showing what the participant viewed as the signals
were presented. The participant mouse clicked on the appropriate signal name after
each presentation.

EXPERIMENTAL DESIGN AND PROCEDURE

Participants  first  completed  an  informed  consent  document  in  accordance  with  the 
strictures  of  the  American  Psychological  Association  (APA).  Participants  then 
viewed  a  computer-based  tutorial  that  described  each  arm  and  hand  signal 
individually.  For  each  signal,  a  short  description  was  presented.  Participants  then 
viewed  a  video  of  a  soldier  in  uniform  performing  the  signal  followed  by  a  direct 
experience of its tactile equivalent. Finally, the participants were able to play the
visual and tactile representations of each signal concurrently. Participants
were  allowed  to  repeat  this  presentation  (i.e.,  visual,  tactile,  visual-tactile 
combined)  as  many  times  as  they  desired.  Once  the  participant  reviewed  the  five 
signals  in  the  two  presentation  styles,  a  validation  exercise  was  performed. 
Participants  had  to  correctly  identify  each  signal  twice  before  the  computer  would 
prompt  the  experimenter  that  the  participant  was  ready  to  begin. 

Each signal was presented in one of three ways: i) visual only
(video presentation of the arm and hand signal), ii) tactile only (tactile
representation of the arm and hand signal), and iii) visual and tactile
simultaneously and congruently (i.e., exactly the same signal was presented both
through the video and through the tactile system at the same time for all of these
trials). Each signal was presented 8 times in each condition (8 trials x 5
different signals = 40 trials each for the visual only, tactile only, and combined visual
and tactile presentations). This gave a grand total of 120 trials. The order in which each
participant  performed  the  120  trials  was  completely  randomized.  The  entire 
experiment  took  less  than  an  hour  to  complete. 
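The trial arithmetic described above can be sketched in a few lines. This is only an illustration under stated assumptions: the original experiment ran in a custom LabVIEW application, the function name `build_trial_schedule` and the modality labels are hypothetical, and the seed is used purely to make the example reproducible.

```python
import random
from collections import Counter

# Signal names are taken from the text; the modality labels are illustrative.
SIGNALS = ["Attention", "Halt", "Rally", "Move Out", "NBC"]
MODALITIES = ["visual", "tactile", "visual+tactile"]
REPETITIONS = 8  # presentations of each signal within each modality


def build_trial_schedule(seed=None):
    """Return a fully randomized list of (signal, modality) trials."""
    trials = [
        (signal, modality)
        for modality in MODALITIES
        for signal in SIGNALS
        for _ in range(REPETITIONS)
    ]
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    rng.shuffle(trials)
    return trials


schedule = build_trial_schedule(seed=1)
per_modality = Counter(modality for _, modality in schedule)
print(len(schedule))            # 120 trials in total
print(per_modality["tactile"])  # 40 trials per modality
```

Shuffling the complete list, rather than blocking by modality, corresponds to the statement that the order of all 120 trials was completely randomized.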

Before  each  trial  began,  the  mouse  cursor  had  to  be  placed  inside  a  small  square 
in  the  center  of  the  screen  by  the  participant.  The  presentation  of  the  signal, 
regardless  of  its  modality,  started  the  timer  and  the  following  performance 
responses were collected: i) the initial movement of the mouse, ii) the latency to
name the received signal, and iii) the signal named and the accuracy of that choice. This
format permitted us to parse the response into pre-motor time (the time from signal
onset to the first movement of the mouse) and motor time (the time from that first
movement to placing the cursor in the appropriate response box). It was these
responses that were subjected to analysis.
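As a minimal sketch of how one trial could be parsed into these components, assume that three timestamps are logged per trial on a common millisecond clock. The record layout, field names, and example values below are hypothetical and are not taken from the original LabVIEW application.

```python
from dataclasses import dataclass


@dataclass
class TrialRecord:
    onset_ms: float       # signal presentation starts the timer
    first_move_ms: float  # first detected mouse movement
    response_ms: float    # cursor placed in the chosen response box
    presented: str        # signal actually presented
    chosen: str           # signal name selected by the participant


def parse_trial(rec):
    """Split one trial into pre-motor time, motor time, and accuracy."""
    pre_motor = rec.first_move_ms - rec.onset_ms
    motor = rec.response_ms - rec.first_move_ms
    return {
        "pre_motor_ms": pre_motor,
        "motor_ms": motor,
        "total_ms": pre_motor + motor,
        "correct": rec.chosen == rec.presented,
    }


# Hypothetical trial: first movement 820 ms after onset, response at 1650 ms.
example = parse_trial(TrialRecord(0.0, 820.0, 1650.0, "Halt", "Halt"))
print(example["pre_motor_ms"], example["motor_ms"], example["correct"])
```

By construction the two components sum to the overall response latency, so an effect that appears in the total but not in the motor component must arise in the pre-motor interval.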


RESULTS 

Results  were  analyzed  in  terms  of  the  speed  of  the  response  and  the  accuracy  of  the 
response  under  the  respective  conditions.  We  did  conduct  an  initial  analysis  for  any 
potential  sex  differences  but  found  no  significant  influence  upon  any  of  the 
measures  recorded.  The  subsequent  analysis  was  therefore  collapsed  across  sex.  A 
one-way  Analysis  of  Variance  (ANOVA)  was  performed  on  the  mean  response 
times  across  the  three  experimental  conditions  of  visual  presentation,  tactile 
presentation  or  visual-tactile  concurrent  and  congruent  presentation,  with  the 
following results: F(2, 213) = 9.37, p < .01 (ηp² = .961, β = 1.00). Post hoc analysis
subsequently showed that simultaneously presented congruent signals resulted in
significantly faster response times than visual signals presented alone, t(71) = 3.15,
p < .01 (see Figure 3). Also, as is evident from this illustration, responses to the
congruent signals were faster than tactile responses alone, t(71) = 10.29, p < .01.
Additionally, the visual only presentation of the signal was significantly faster than
the tactile only presentation of the signal, t(71) = -4.15, p < .01.
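The degrees of freedom reported for the omnibus test, F(2, 213), are what a one-way ANOVA yields when the 3 x 72 = 216 mean response times are treated as three groups (3 - 1 = 2 and 216 - 3 = 213). The sketch below computes the F ratio from scratch to make that arithmetic explicit. It assumes this between-groups layout, and the data are randomly generated placeholders, not the experimental results.

```python
import random
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [mean(g) for g in groups]
    grand_mean = mean([x for g in groups for x in g])
    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Placeholder data only: 72 hypothetical mean response times per condition.
rng = random.Random(0)
conditions = [[rng.gauss(mu, 150.0) for _ in range(72)] for mu in (900, 1000, 850)]
f_stat, df_between, df_within = one_way_anova(conditions)
print(df_between, df_within)  # 2 213
```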


Figure 3. Response time in milliseconds by signal presentation condition (Visual Only, Tactile Only, Visual & Tactile Simultaneous). *t(71) = 3.15, p < .01.

Analysis of the response accuracy data showed that there was a significant
difference in the accuracy rate between the visual and tactile signals when presented
alone, t(71) = -7.10, p < .01. This difference was most likely due to the extraordinarily
high accuracy rate for visual performance, since the military participants were already
familiar with, and had some previous level of training in, the visual
presentation of the signals but had no prior experience with the tactile presentations.
There was also a significant difference in the accuracy rate when responses using
the tactile modality were compared to the concurrent congruent presentation of the
signals, t(71) = 7.47, p < .01. Here, response to tactile signals proved less accurate
than to the combined visual-tactile presentation. The overall lower accuracy rate for
the tactile signaling is again attributed to the confusion between the tactile signals
for “NBC” and “Halt”. Analysis without the “NBC” tactile signal data again
removed these significant differences in response accuracy. There was no
significant difference between responses for the visual only condition and the
combined condition.

A one-way Analysis of Variance (ANOVA) was performed on the mean
response times for the pre-motor element (the time that elapsed from presentation of
the signal to the first movement of the mouse) across the three experimental
conditions of visual presentation, tactile presentation, or visual-tactile concurrent
and congruent presentation. This analysis produced a significant effect, F(2,
213) = 5.48, p < .01 (ηp² = .961, β = 1.00). Subsequent pair-wise comparisons showed
that simultaneously presented congruent signals resulted in significantly faster
pre-motor response times than visual signals presented alone, t(71) = 4.30, p < .01 (see
Figure 4). Also, as is evident from the illustration, pre-motor times for the congruent
signals were faster than those for tactile alone, t(71) = -2.9. Additionally, the
visual only presentation of the signal was significantly faster for pre-motor response
than the tactile only presentation of the signal, t(71) = -2.89, p < .01.

Figure 4. Pre-motor response time in milliseconds by signal presentation condition.

As  previously  stated,  the  presentation  of  the  signal,  regardless  of  its  modality, 
started  the  experimental  timer,  allowing  the  capture  of  the  latency  from  signal 
elicitation to the initial movement of the mouse, or pre-motor response. The latency
to name the received signal, that is, the time from the initial mouse movement
to the time that the cursor resided in the
appropriate response box, was regarded as the motor response time. There were
no  differences  found  across  any  of  the  experimental  conditions  for  motor  response 
latency. 

It  was  further  hypothesized  that  there  could  be  some  differences  between  the 
two  respective  groups  of  student  and  cadet  participants  due  to  their  differential 
experience with the hand signals communicated. The participants from the military
academy  had  some  prior  experience  with  the  visual  form  of  message  while  the 
university  students  were  encountering  them  for  the  first  time.  To  a  degree,  any  such 
difference  should  have  been  mitigated  by  the  practice  given.  However,  we  chose  to 
examine this eventuality analytically. A simple t-test did distinguish such a
difference which was evident in the pre-motor response time to the tactile signals
only (i.e., t(70) = 1.99, p < .01 [military cadets = 785 ms vs. university students = 956
ms]).  Potential  reasons  for  this  interesting  outcome  and  an  evaluation  of  all  of  the 
present  results  are  discussed  below. 
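The degrees of freedom attached to the t statistics in this section suggest two kinds of comparison: within-participant (paired) tests across conditions, with 72 - 1 = 71 degrees of freedom, and a between-group test for cadets versus students, with 41 + 31 - 2 = 70. The text does not name the exact procedures, so the sketch below assumes a paired t-test and a pooled-variance independent t-test. All data are randomly generated placeholders.

```python
import random
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(a, b):
    """Paired-samples t statistic and its degrees of freedom (n - 1)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom (n1 + n2 - 2)."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Placeholder data only, sized to match the reported sample sizes.
rng = random.Random(0)
visual = [rng.gauss(900, 150) for _ in range(72)]
combined = [v - rng.gauss(50, 40) for v in visual]
cadets = [rng.gauss(785, 200) for _ in range(41)]
students = [rng.gauss(956, 200) for _ in range(31)]

print(paired_t(visual, combined)[1])      # 71
print(independent_t(cadets, students)[1])  # 70
```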


DISCUSSION 

From  a  simple  ‘horse-race’  model  of  combinational  processing,  one  would  initially 
expect  that  the  combined  visual  and  tactile  presentation  of  consistent  signals  would 
be equivalent to the faster of the two modalities (i.e., visual or tactile when
presented  alone).  However,  this  simplistic  conception  was  not  supported  by  the 
data.  Rather,  the  combinatorial  condition  was  faster  than  either  the  visual  alone  or 
the  tactile  alone  condition.  Neither  could  enhanced  processing  speed  be  attributed  to 
a  tradeoff  of  speed  for  accuracy  since  the  combined  condition  was  significantly 
more  accurate  than  the  tactile  alone  presentation,  although  this  latter  result  might 
have  been  affected  by  a  confusion  between  two  specific  forms  of  tactile  signal. 
However,  in  general,  what  emerges  is  a  genuine  advantage  in  performance  for  the 
multi-modal  signal  presentation.  There  are  a  number  of  potential  reasons  why  this 
may occur. At present, we must postulate some form of multi-signal
reinforcement effect that derives from the facilitation due to cross-reinforcement of
sensory signals. A more realistic source for the enhancement may lie in the
neurophysiological linkages discussed at the start of this paper. It appears
that  cross-modal  reinforcement  has  a  direct  effect  on  strength  of  synaptic 
transmission  that  is  experienced  early  in  the  stimulus  processing  sequence.  It  was  to 
explore  this  possibility  that  the  experiment  was  conducted  which  parsed  the 
response  in  order  to  isolate  motor  output  components  of  the  response  sequence. 
Here,  we  found  a  strong  confirmation  first  of  the  multi-modal  presentation 
advantage  and  second  of  the  isolation  of  that  advantage  into  the  early,  pre-motor 
stages  of  response.  At  present,  it  is  uncertain  whether  the  primary  advantage  is  to  be 
found in the perceptual recognition phase of the response sequence or in the
decision-making  and  response  formulation  element  of  that  sequence.  However,  the 
distinction  of  such  a  difference  is  amenable  to  further  empirical  identification.  From 
the  assembly  of  present  results  it  appears  that  a  neuro-physiological  argument 
underlying  cross-modal  stimulation  provides  the  best  candidate  account  for  the 
early  advantage  offered  by  consistent  multi-modal  signaling. 
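The simple 'horse-race' expectation described in this discussion can be written down directly: the combined condition should be no faster than the faster single modality, so any positive difference counts as a multi-modal gain. The function names and the millisecond values below are hypothetical and serve only to show the comparison, not to reproduce the reported means.

```python
def horse_race_prediction(visual_ms, tactile_ms):
    """Combined response time expected under the simple horse-race reading."""
    return min(visual_ms, tactile_ms)


def multimodal_gain(visual_ms, tactile_ms, combined_ms):
    """Positive values mean the combined condition beat the faster modality."""
    return horse_race_prediction(visual_ms, tactile_ms) - combined_ms


# Hypothetical condition means in milliseconds, for illustration only.
gain = multimodal_gain(visual_ms=900.0, tactile_ms=1000.0, combined_ms=850.0)
print(gain)  # 50.0
```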